An Agents and Artifacts Approach to Distributed Data Mining
Abstract
This paper proposes a novel Distributed Data Mining (DDM) approach based on the Agents and Artifacts paradigm, as implemented in CArtAgO [9], in which artifacts encapsulate data mining tools inherited from Weka that agents can use while engaged in collaborative, distributed learning processes. Target hypotheses are currently restricted to decision trees built with J48, but the approach is flexible enough to accommodate other kinds of learning models. The contribution of this work is twofold: i) JaCA-DDM, an extensible tool implemented in the agent-oriented programming language Jason [2] and CArtAgO [10,9] for experimenting with agent-based DDM approaches on different, well-known training sets; and ii) a collaborative protocol in which an agent builds an initial decision tree and then refines this initial hypothesis using instances from other agents that the tree does not yet cover (counter examples). This reduces the number of instances communicated while preserving accuracy compared with fully centralized approaches.
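The counter-example protocol described in the abstract can be illustrated with a minimal, self-contained sketch. It is not the JaCA-DDM implementation: the names `train_stump` and `collaborative_learning` are hypothetical, a one-level decision stump stands in for Weka's J48, plain lists stand in for agents and artifacts, and "not covered" is assumed to mean "misclassified by the current model".

```python
from collections import Counter


def train_stump(instances):
    """Fit a one-level decision tree on (features, label) pairs.

    Assumption: this stump is only a stand-in for J48, used to keep
    the sketch free of external dependencies.
    """
    majority = Counter(label for _, label in instances).most_common(1)[0][0]
    best = None  # (errors, feature index, threshold, left label, right label)
    for f in range(len(instances[0][0])):
        for threshold in sorted({x[f] for x, _ in instances}):
            left = [y for x, y in instances if x[f] <= threshold]
            right = [y for x, y in instances if x[f] > threshold]
            if not left or not right:
                continue  # a split must separate the data
            l_lab = Counter(left).most_common(1)[0][0]
            r_lab = Counter(right).most_common(1)[0][0]
            errors = sum(y != l_lab for y in left) + sum(y != r_lab for y in right)
            if best is None or errors < best[0]:
                best = (errors, f, threshold, l_lab, r_lab)
    if best is None:
        return lambda x: majority  # no usable split: predict the majority class
    _, f, t, l_lab, r_lab = best
    return lambda x: l_lab if x[f] <= t else r_lab


def collaborative_learning(own_data, other_agents_data):
    """Build an initial model, then refine it with counter examples only.

    Returns the final model and the number of instances communicated.
    """
    training = list(own_data)
    model = train_stump(training)
    communicated = 0
    for data in other_agents_data:
        # Each remote agent sends only the instances the current model gets wrong.
        counter_examples = [(x, y) for x, y in data if model(x) != y]
        if counter_examples:
            training.extend(counter_examples)
            communicated += len(counter_examples)
            model = train_stump(training)  # rebuild the hypothesis
    return model, communicated
```

In this toy setting, only the misclassified instances travel between agents, so the communication cost is the count returned as the second value, not the total size of the remote data sets.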
Related resources
Entropy-based Consensus for Distributed Data Clustering
The increasingly larger scale of available data and the more restrictive concerns on their privacy are some of the challenging aspects of data mining today. In this paper, Entropy-based Consensus on Cluster Centers (EC3) is introduced for clustering in distributed systems with a consideration for confidentiality of data; i.e. it is the negotiations among local cluster centers that are used in t...
Outlier Detection in Wireless Sensor Networks Using Distributed Principal Component Analysis
Detecting anomalies is an important challenge for intrusion detection and fault diagnosis in wireless sensor networks (WSNs). To address the problem of outlier detection in wireless sensor networks, in this paper we present a PCA-based centralized approach and a DPCA-based distributed energy-efficient approach for detecting outliers in sensed data in a WSN. The outliers in sensed data can be ca...
The Role of Agents in Distributed Data Mining: Issues and Benefits
The increasing demand to extend data mining technology to data sets inherently distributed among a large number of autonomous and heterogeneous sources over a network with limited bandwidth has motivated the development of several approaches to distributed data mining and knowledge discovery, of which only a few make use of agents. We briefly review existing approaches and argue for the potentia...
Privacy-preserving agent-based distributed data clustering
A growing number of applications in distributed environments involve very large data sets that are inherently distributed among a large number of autonomous sources over a network. The demand to extend data mining technology to such distributed data sets has motivated the development of several approaches to distributed data mining and knowledge discovery, of which only a few make use of agents....
Distributed Data Mining and Agent Mining Interaction and Integration: a Novel Approach
In recent years, more and more researchers have been involved in research on both agent technology and distributed data mining. A clear disciplinary effort has been made toward removing the boundary between them, that is, toward the interaction and integration of agent technology and distributed data mining. We refer to this new area as agent mining. The marriage of agents and distributed d...